Abstract Fifteen years ago, Monique Tirion showed that the low-frequency normal
modes of a protein are not significantly altered when non-bonded interactions are
replaced by Hookean springs, for all atom pairs whose distance is smaller than a
given cutoff value. Since then, it has been shown that coarse-grained versions of
Tirion's model are able to provide fair insights into many dynamical properties of
biological macromolecules. In this chapter, the theoretical tools required for
studying these so-called Elastic Network Models are described, focusing on practical
issues and, in particular, on possible artifacts. Then, an overview of some typical
results that have been obtained by studying such models is given.
In 1996, Monique Tirion showed that the low-frequency normal modes of a protein
(see section 3.1) are not significantly altered when Lennard-Jones and electrostatic
interactions are replaced by Hookean (harmonic) springs, for all atom pairs
whose distance is smaller than a given cutoff value [1]. In the case of biological
macromolecules, this seminal work happened to be the first study of an Elastic
Network Model (ENM). The ENM considered was an all-atom one, chemical bonds
and angles being kept fixed through the use of internal coordinates, as often done in
previous standard normal mode studies of proteins [2, 3, 4].

Soon afterwards, several coarse-grained versions of Tirion's ENM were proposed,
in which each protein amino-acid residue is usually represented as a single
bead and where most, if not all, chemical "details" are disregarded [5, 6], including
atom types and amino-acid masses.

Since then, it has been shown that such highly simplified protein models are
able to provide fair insights into the dynamical properties of biological
macromolecules [5, 7, 8, 9], including those involved in their largest amplitude
functional motions [10, 11], even in the case of large assemblies like RNA polymerase
II [12], transmembrane channels [13, 14], whole virus capsids [15] or even the
ribosome [16]. As a consequence, numerous applications have been proposed, notably
for exploiting fiber diffraction data [17], solving difficult molecular replacement
problems [18, 19], or for fitting atomic structures into low-resolution electron
density maps [19, 20, 21, 22, 23].

However, the idea that simple models can be enough for capturing major properties
of objects as complex as proteins had been put forward well before Tirion's
introduction of ENMs in the realm of molecular biophysics. In the following,
after a brief account of previous results supporting this claim (section 2), theoretical



Elastic Network Models 3 
tools required for studying an ENM are described (section 3), focusing on practical
issues and, in particular, on possible artifacts. Then, an overview of typical results
that have been obtained by studying protein ENMs is given (section 4).



2 Background 

Indeed, coarse-grained models of proteins had been considered twenty years before
M. Tirion's work, for studying what may well be the most complex phenomenon
known at the molecular scale, namely, protein folding. As early as 1975,
Michael Levitt and Arieh Warshel proposed to model a protein as a chain of beads,
each bead corresponding to the Cα atom of an amino-acid residue, the centroid of
each amino-acid sidechain being taken into account with another bead grafted onto
the chain [24]. That same year, Nobuhiro Go and his collaborators proposed an
even simpler model in which the chain of beads is mounted on a two-dimensional
lattice, each bead corresponding either to a single residue or, more likely, to a
secondary structure element (e.g., an α-helix) of a protein [25]. Moreover, while the
Levitt-Warshel model had been designed so as to study a specific protein, that is,
a polypeptide chain with a given sequence of amino-acid residues, the Go model
focuses on the conformation of the chain, more precisely, on the set of pairs of
amino acids that are interacting with each other in the chosen (native) structure.

So, it is fair to view protein ENMs as off-lattice versions of the Go model. 

Lattice models of proteins have been studied extensively since then so as to gain,
for instance, a better understanding of the sequence-structure relationship.
Notably, if the chain is short enough, all possible conformations on the lattice can
be enumerated, allowing for accurate calculations of thermodynamic quantities and



4 Y.-H. Sanejouand 

unequivocal determination of the free energy minimum. Moreover, if the number of
different amino acids is small enough, then the whole sequence space can also be
addressed. For instance, in the case of the three-dimensional cubic lattice, a 27-mer
chain has 103346 self-avoiding compact (i.e. cubic) conformations [26]. On the
other hand, if only two kinds of amino acids are retained, that is, if only their
hydrophobic or hydrophilic nature is assumed to be relevant for the understanding of
protein stability, then a 27-mer has $2^{27}$ (about $1.3 \times 10^8$) different
possible sequences. This is a large number, but it remains small enough so that for
each sequence the lowest-energy compact conformation can be determined and, when a
nearly-additive interaction energy is considered [27], the conclusion of such a
systematic study happens to be an amazing one. Indeed, it was found that a few
conformations (1% of them) are "preferred" by large sets of sequences [28]. Moreover,
although each of these sets forms a neutral net in the sequence space, it is often
possible to "jump" from one preferred conformation to another, as a consequence of
single-point mutations [29].

While the former property is indeed expected to be a protein-like one, helping to
explain why proteins are able to accommodate so many different single-point
mutations without significant loss of either their structure or their function, it is
only during the last few years that the latter one has been demonstrated. In
particular, using sequence design techniques, a pair of proteins with 95% sequence
identity, but different folds and functions, was recently obtained [30]. If generic
enough, such a property would help to understand how the various protein folds
nowadays found on earth may have been "discovered" during the earliest phases of the
evolution of life (e.g. prebiotic ones), since discovering a first fold could have
been enough for having access to many other ones, one single-point mutation after
another.

In any case, this example shows how the study of simple models can help
to think about, and maybe to understand better, major protein properties, in
particular because such models can be studied on a much larger scale than
actual proteins.

3 Theoretical foundations 

The vast majority of protein ENM studies rely on Normal Mode Analysis (NMA) [9].
Moreover, the hypotheses underlying this kind of analysis probably inspired the
design of the first ENM. Actually, in her seminal work, M. Tirion performed NMA in
order to show that similar results can be obtained by studying an ENM or a protein
described at a standard, semi-empirical, level [1]. So, hereafter, the principles of
NMA are briefly recalled (more details can be found in classic textbooks [31, 32]).
Next, the close relationship between NMA and the different types of ENMs is
underlined.

3.1 Normal Mode Analysis 

Newton's equations of motion for a set of N atoms cannot be solved analytically
when N is large (namely, N > 2), except in rare instances like the following, rather
general, one. Indeed, for small enough displacements of the atoms in the vicinity
of their equilibrium positions, V, the potential energy of the studied system, can be
approximated by the first terms of a Taylor series:

\[
V = V_0 + \sum_{i=1}^{3N} \left( \frac{\partial V}{\partial r_i} \right)_0 (r_i - r_i^0)
  + \frac{1}{2} \sum_{i=1}^{3N} \sum_{j=1}^{3N}
    \left( \frac{\partial^2 V}{\partial r_i \partial r_j} \right)_0
    (r_i - r_i^0)(r_j - r_j^0) \tag{1}
\]

where $r_i$ is the $i$-th coordinate, $r_i^0$ its equilibrium value, and $V_0$ the
potential energy of the system at equilibrium.

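Since, within the frame of classical physics, the exact value of V is meaningless
(only potential energy differences are expected to play a physical role), $V_0$ can be
set to zero. Moreover, since $V_0$ is a minimum of V, for each coordinate:

\[
\left( \frac{\partial V}{\partial r_i} \right)_0 = 0
\]

This yields:

\[
V = \frac{1}{2} \sum_{i=1}^{3N} \sum_{j=1}^{3N}
    \left( \frac{\partial^2 V}{\partial r_i \partial r_j} \right)_0
    (r_i - r_i^0)(r_j - r_j^0) \tag{2}
\]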

In other words, if the atomic displacements around an equilibrium configuration 
are small enough, then the potential energy of a system can be approximated by a 
quadratic form. 

On the other hand, if the system is not under any constraint with an explicit time
dependence, then its kinetic energy can also be written as a quadratic form [31]
and it is straightforward to show that, when both potential and kinetic energy
functions are quadratic forms, the equations of atomic motion have the following
analytical solutions [31, 32, 33]:

\[
r_i(t) = r_i^0 + \frac{1}{\sqrt{m_i}} \sum_{k=1}^{3N} C_k a_{ik}
         \cos(2\pi\nu_k t + \Phi_k) \tag{3}
\]

where $m_i$ is the atomic mass and where $C_k$ and $\Phi_k$, the amplitude and phase
of the so-called normal mode of vibration k, depend upon the initial conditions, that
is, upon atomic positions and velocities at time t = 0. Notably, $C_k$ is a simple
function of $E_k$, the total energy of mode k. In particular, if all modes have
identical total energies, then:

\[
C_k = \frac{\sqrt{2 k_B T}}{2\pi\nu_k} \tag{4}
\]

where T is the temperature and $k_B$ the Boltzmann constant. This means that the
amplitude of mode k goes as the inverse of its frequency, $\nu_k$. As a matter of
fact, when NMA is performed in the case of proteins, using standard all-atom force
fields, it can be shown that modes with frequencies below 30-100 cm$^{-1}$ are
responsible for 90-95% of the atomic displacements [34].
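
As a brief check of eq. 4, and assuming (this is not stated explicitly above) that the
energy shared by all modes is the equipartition value $E_k = k_B T$, the amplitude
follows from the energy of a harmonic oscillator written in terms of the mass-weighted
normal coordinate of mode k (see eq. 6), $q_k(t) = C_k \cos(2\pi\nu_k t + \Phi_k)$:

```latex
\begin{align*}
E_k &= \tfrac{1}{2}\,\dot{q}_k^2 + \tfrac{1}{2}\,(2\pi\nu_k)^2\, q_k^2
     = \tfrac{1}{2}\,(2\pi\nu_k)^2\, C_k^2 , \\
E_k &= k_B T \;\Longrightarrow\; C_k = \frac{\sqrt{2 k_B T}}{2\pi\nu_k} .
\end{align*}
```

The first line uses $\sin^2 + \cos^2 = 1$; the second one shows why, at a given
temperature, the lowest-frequency modes carry the largest amplitudes.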

Note that such analytical solutions can provide various thermodynamic
quantities like entropy, enthalpy, etc., and this, even at a quantum mechanical level
of description [34].

In practice, the $a_{ik}$'s involved in eq. 3, which give the coordinate contributions
to mode k, are obtained as the k-th eigenvector of H, the mass-weighted Hessian of the
potential energy, that is, the matrix whose element ij is:

\[
H_{ij} = \frac{1}{\sqrt{m_i m_j}}
         \left( \frac{\partial^2 V}{\partial r_i \partial r_j} \right)_0 \tag{5}
\]

By definition, the 3N eigenvectors of a matrix like H form an orthogonal basis set.
This means that, when $k \neq l$:

\[
\langle q_k q_l \rangle = 0
\]

where $q_k$ is the so-called normal coordinate, obtained by projecting the 3N
mass-weighted cartesian coordinates onto eigenvector k, namely:

\[
q_k = \sum_{i=1}^{3N} a_{ik} \sqrt{m_i}\, (r_i - r_i^0) \tag{6}
\]

Moreover, the eigenvalues of H, that is, the diagonal elements of the matrix
obtained by expressing H in this new basis set, provide the 3N frequencies of the
system since, for each mode k:

\[
\nu_k = \frac{\sqrt{\lambda_k}}{2\pi}
\]

where $\lambda_k$ is the k-th eigenvalue of H.

The eigenvalues and eigenvectors of a matrix are obtained by an operation called
a diagonalization. In principle, for a real and symmetric matrix like H, such an
operation is always possible. At a practical level, when the matrix size is not too
large, that is, if the matrix can be stored in the computer memory, algorithms and
methods available in standard mathematical packages make it possible to obtain its
eigenvalues and eigenvectors at a CPU cost rising as $nN^2$, where n is the number of
requested eigensolutions. In other words, it is rather straightforward to obtain
analytical solutions for the atomic motions, as long as small-amplitude displacements
around a given, well-defined, equilibrium configuration are considered. Note that for
a three-dimensional system at equilibrium, at least six zero eigenvalues have to be
obtained (except if the system is linear, in which case there are five of them),
corresponding to the six possible rigid-body motions (translations or rotations) of
the entire system. However, if the system is not at equilibrium, negative eigenvalues
are usually observed. Moreover, significant mixing between rotation modes and some
others can occur, leaving three zero eigenvalues only, that is, those corresponding to
the three translation modes of the system [33].
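
A minimal sketch of this diagonalization step, assuming NumPy is available. The
function name `normal_modes` and the toy system are illustrative choices, not taken
from the chapter. The toy system is a one-dimensional chain of three unit masses
joined by two unit springs, so a single zero eigenvalue (the overall translation) is
expected, instead of the six of a three-dimensional system.

```python
import numpy as np

def normal_modes(hessian, masses):
    """Diagonalize the mass-weighted Hessian (eq. 5).

    hessian : (n, n) array of second derivatives of V at equilibrium
    masses  : (n,) array, mass associated with each coordinate
    Returns eigenvalues (ascending), frequencies and eigenvectors (columns).
    """
    hessian = np.asarray(hessian, dtype=float)
    inv_sqrt_m = 1.0 / np.sqrt(np.asarray(masses, dtype=float))
    mass_weighted = hessian * np.outer(inv_sqrt_m, inv_sqrt_m)
    # eigh is meant for real symmetric matrices; eigenvalues come out sorted.
    eigenvalues, eigenvectors = np.linalg.eigh(mass_weighted)
    # Eigenvalues are (2*pi*nu)^2; negative ones (structure not at a minimum)
    # are clipped to zero here only so that the square root stays real.
    frequencies = np.sqrt(np.clip(eigenvalues, 0.0, None)) / (2.0 * np.pi)
    return eigenvalues, frequencies, eigenvectors

# Toy system (hypothetical): three unit masses on a line, two unit springs.
toy_hessian = np.array([[ 1.0, -1.0,  0.0],
                        [-1.0,  2.0, -1.0],
                        [ 0.0, -1.0,  1.0]])
eigenvalues, frequencies, modes = normal_modes(toy_hessian, np.ones(3))
n_zero = int(np.sum(np.abs(eigenvalues) < 1e-9))
print("eigenvalues:", eigenvalues.round(6), "zero modes:", n_zero)
```

The raw eigenvalues are returned along with the frequencies so that negative values,
the signature of a structure that is not at equilibrium, remain visible to the caller.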

The main drawback of NMA is obvious: the actual dynamics of a protein
is much more complicated than assumed above. As a matter of fact, even
on the short timescales considered within the frame of standard molecular
dynamics simulations, a protein is able to jump from the attraction basin of one
equilibrium configuration to another [35], and the number of these equilibrium
configurations is so huge that it is unlikely for a nanosecond trajectory to visit
one of them twice. In other words, while NMA focuses on protein dynamics
at the level of a single minimum of the potential energy surface (PES), it is
well known that for proteins at room temperature the relevant PES is a highly
complex, multi-minima, one.

NMA has several other drawbacks. For instance, starting from a given protein
structure, e.g., as found in the Protein Data Bank (PDB), an equilibrium configuration
has to be reached. This is usually done using energy-minimization techniques. As a
consequence, the structure studied with NMA and a standard force field is always
a distorted one, the Cα root-mean-square deviation (Cα r.m.s.d.) from the initial
structure being typically 1-2 Å [9].

More importantly, within the frame of NMA, it is not obvious how to take solvent
effects into account, as the meaning of an equilibrium configuration in the case of
an ensemble of molecules in the liquid state is unclear. As a matter of fact, the first
NMA studies of proteins were performed in vacuo [2, 3, 4, 36]. Note that, nowadays,
the availability of implicit solvent models, like EEF1 [37], offers a more satisfactory
alternative.

However, as shown below, the main idea underlying the design of protein ENMs 
is not only to ignore the well-known drawbacks of NMA but, building upon its 
empirical successes, to add a few more on top of them. 

3.2 The Elastic Network Model 

In essence, there are two different types of ENMs, which differ by their
dimensionality. The Gaussian Network Model (GNM), proposed by Ivet Bahar, Burak
Erman and Turkan Haliloglu in 1997 [5, 38], is a one-dimensional model while Tirion's
model, later called the Anisotropic Network Model [39] (ANM), is a three-dimensional
one.

Although eq. 2 may look simple, it relies on a large number of parameters, namely,
the elements of the Hessian matrix (eq. 5). In order to make it even simpler, M.
Tirion proposed to replace eq. 2 by another quadratic form, namely:

\[
V = \frac{1}{2}\, k_{enm} \sum_{d_{ij}^0 < R_c} (d_{ij} - d_{ij}^0)^2 \tag{7}
\]

where $d_{ij}$ is the actual distance between atoms i and j, $d_{ij}^0$ being their
distance in the studied structure [1], the sum running over all atom pairs for which
$d_{ij}^0 < R_c$. This amounts to setting Hookean springs between all pairs of
atoms less than $R_c$ ångströms away from each other. Note that in Tirion's work, as
well as in most ANM studies (there are notable exceptions [40]), $k_{enm}$, the spring
force constant, is the same for all atom pairs. When this is so, the role of $k_{enm}$
is just to specify which system of units is used, $R_c$ being the only physically
relevant parameter of the model. In other words, when studying an ENM, the major
drawback added with respect to standard NMA is that most atomic details are simply
ignored.

However, considering eq. 7 instead of eq. 2 has several practical advantages.
First, an energy minimization is not required any more, since the configuration
whose energy is the absolute minimum (V = 0) is known: it is the studied one.
As a corollary, results obtained by studying ENMs are easier to reproduce. Indeed,
an energy minimization not only introduces unwanted distortions in a structure, but
it does so in a way that strongly depends upon the tiniest details of the protocol
used, also as a consequence of the huge number of minima of a realistic PES
for a biological macromolecule. Last but not least, as a straightforward consequence
of eq. 7, the elements of the Hessian matrix (see eq. 5) are as simple as:

\[
\frac{\partial^2 V}{\partial x_i \partial y_j}
  = -k_{enm}\, \frac{(x_i - x_j)(y_i - y_j)}{(d_{ij}^0)^2} \tag{8}
\]

where $\partial^2 V / \partial x_i \partial y_j$ is the (non mass-weighted) element
corresponding to the x and y coordinates of two distinct, interacting atoms i and j.
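
As an illustration of how little is needed to set up an ANM, here is a minimal
sketch, assuming NumPy, unit masses and a single spring constant. The function name
`anm_hessian` and the toy tetrahedron are hypothetical choices made for this example.
Off-diagonal 3x3 blocks follow eq. 8; each diagonal block is minus the sum of the
off-diagonal blocks of its row, which is what differentiating eq. 7 twice with
respect to the coordinates of a single atom gives.

```python
import numpy as np

def anm_hessian(coords, cutoff, k_enm=1.0):
    """Hessian of eq. 7 at the studied structure, built from eq. 8.

    coords : (N, 3) array of positions (e.g. C-alpha atoms), in angstroms
    cutoff : R_c, in the same unit; k_enm : common spring force constant
    """
    coords = np.asarray(coords, dtype=float)
    n_atoms = coords.shape[0]
    hessian = np.zeros((3 * n_atoms, 3 * n_atoms))
    for i in range(n_atoms):
        for j in range(i + 1, n_atoms):
            delta = coords[i] - coords[j]
            dist2 = float(delta @ delta)
            if dist2 >= cutoff ** 2:
                continue  # no spring between atoms i and j
            # Off-diagonal 3x3 block: -k (x_i - x_j)(y_i - y_j) / d0^2, etc.
            block = -k_enm * np.outer(delta, delta) / dist2
            si, sj = slice(3 * i, 3 * i + 3), slice(3 * j, 3 * j + 3)
            hessian[si, sj] = block
            hessian[sj, si] = block
            # Diagonal blocks accumulate minus the off-diagonal ones.
            hessian[si, si] -= block
            hessian[sj, sj] -= block
    return hessian

# Toy structure (hypothetical): a regular tetrahedron, edge length ~2.83.
tetrahedron = np.array([[1, 1, 1], [1, -1, -1], [-1, 1, -1], [-1, -1, 1]], float)
hessian = anm_hessian(tetrahedron, cutoff=3.0)
eigenvalues = np.linalg.eigvalsh(hessian)
n_zero = int(np.sum(np.abs(eigenvalues) < 1e-9))
print("zero eigenvalues:", n_zero)
```

The double loop is kept for readability; for large systems a neighbor search and a
sparse matrix would be the natural next step, since most blocks are empty.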



3.2.2 The Gaussian Network Model 

Because $R_c$, the cutoff value of an ANM, is usually rather small (see section 3.2.3),
the corresponding Hessian matrix is sparse, that is, most of its elements (eq. 8) are
zeroes. So, as proposed by I. Bahar, B. Erman and T. Haliloglu [5], it is tempting
to go one step further in the simplification process and to consider the
corresponding adjacency matrix, that is, the matrix whose elements are:

\[
h_{ij} = -k_{enm} \tag{9}
\]

when residues i and j are interacting ($h_{ij} = 0$ otherwise). Note that in the case
of an adjacency matrix, as well as for the Hessian matrix of an ANM, $h_{ii}$, the
diagonal element i, is such that:

\[
h_{ii} = -\sum_{j \neq i} h_{ij} \tag{10}
\]

Of course, with an adjacency matrix, information about directionality is
missing. This is a major drawback of GNMs since it means that studying a GNM
can only provide information about motion amplitudes.

Note that GNMs are usually, if not always, set up at the residue level, while
ANMs are sometimes studied at the atomic level, as in the seminal study of M.
Tirion [1]. From now on, to underline such (not so common) cases, these latter
models will be called "all-atom ANMs".
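
The construction of eqs. 9 and 10 can be sketched as follows, assuming NumPy. The
function name `gnm_matrix` and the four-bead chain are hypothetical. Off-diagonal
elements are taken negative here, as for the ANM Hessian, so that the non-zero
eigenvalues come out positive. The last lines show the only kind of information such
a model yields: relative motion amplitudes, one number per residue.

```python
import numpy as np

def gnm_matrix(coords, cutoff, k_enm=1.0):
    """Connectivity matrix of eqs. 9 and 10 (one row per residue)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    matrix = np.where(dist < cutoff, -k_enm, 0.0)   # eq. 9 for interacting pairs
    np.fill_diagonal(matrix, 0.0)                   # drop the i == j "pairs"
    np.fill_diagonal(matrix, -matrix.sum(axis=1))   # eq. 10
    return matrix

# Toy chain (hypothetical): four beads on a line, 3.8 angstroms apart.
chain = np.array([[3.8 * i, 0.0, 0.0] for i in range(4)])
matrix = gnm_matrix(chain, cutoff=4.0)
eigenvalues, eigenvectors = np.linalg.eigh(matrix)

# Relative fluctuation of each bead: sum over non-zero modes of a_ik^2 / lambda_k.
nonzero = eigenvalues > 1e-8
fluctuations = (eigenvectors[:, nonzero] ** 2 / eigenvalues[nonzero]).sum(axis=1)
print("relative fluctuations:", fluctuations.round(3))
```

In this toy chain the two end beads, which have a single neighbor, fluctuate more than
the two inner ones, but nothing can be said about the direction of their motion.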
3.2.3 The cutoff issue

The main, if not the only, parameter of an ENM is $R_c$. Although several studies
have tried to justify the choice of a particular value for this parameter, typically by
comparing calculated and experimental quantities, cutoff values over a wide range
are still in common use, varying between 7 [41] and 16 Å [8].

For the most part, this probably reflects the fact that the lowest-frequency modes of
an ENM are usually "robust" [42], that is, rather insensitive to the way the model is
built. However, it is obvious that, to be meaningful, the value of $R_c$ has to be on
the small side. To take an extreme case: for a GNM (see section 3.2.2), if $R_c$
is so large that the adjacency matrix is completely filled with non-zero elements, its
eigenvalues and eigenvectors, apart from being degenerate, will only depend upon
N, the size of the system, and not upon its topology or its shape. As a consequence,
they certainly cannot provide any useful information. On the other hand, if $R_c$ is
too small, then the network of interacting residues is split into sub-networks, either
free to rotate with respect to one another (in the case of an ANM) or completely
independent from each other (in both ANM and GNM cases). Such dynamical properties
are certainly not among those expected for a macromolecule, and this is why, in
ANM studies, the smallest cutoff values used are of the order of 8-10 Å [10, 12],
that is, larger than the typical distance between two interacting amino-acid residues
in a protein, namely 6-7 Å [43, 44].

In practice, choosing too small a value for $R_c$ yields additional zero
eigenvalues.

So, if more than one (for a GNM) or six (for an ANM) zero eigenvalues are
obtained, then it is highly recommended to increase $R_c$. Note that GNMs allow for
the use of smaller values of $R_c$ (a value of 7.3 Å is often chosen [41]) since in the
case of a one-dimensional model a single connection is enough for avoiding any
free translation of a group of atoms with respect to another. As a consequence, when
a GNM is built with Cα atoms picked from a single protein chain, that is, when all
amino-acid residues are chemically bonded to each other through peptide bonds, a
value of $R_c$ as low as 4 Å (the typical distance between two consecutive Cα atoms)
can be used.
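
This diagnostic is easy to automate. The sketch below, assuming NumPy, counts the
numerically zero eigenvalues of a GNM matrix for two cutoff values. The function
names, the tolerance and the toy coordinates (two three-bead segments separated by a
6 Å gap, as could happen with missing residues) are all hypothetical.

```python
import numpy as np

def gnm_matrix(coords, cutoff):
    """GNM connectivity matrix with unit springs (see eqs. 9 and 10)."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    matrix = np.where(dist < cutoff, -1.0, 0.0)
    np.fill_diagonal(matrix, 0.0)
    np.fill_diagonal(matrix, -matrix.sum(axis=1))
    return matrix

def count_zero_eigenvalues(matrix, tol=1e-8):
    """Number of (numerically) zero eigenvalues of a symmetric matrix."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(matrix, dtype=float))
    return int(np.sum(np.abs(eigenvalues) < tol))

# Toy chain (hypothetical): beads 3.8 angstroms apart, with one 6 angstrom gap.
x = [0.0, 3.8, 7.6, 13.6, 17.4, 21.2]
coords = np.array([[xi, 0.0, 0.0] for xi in x])

zeros_small = count_zero_eigenvalues(gnm_matrix(coords, cutoff=4.0))
zeros_large = count_zero_eigenvalues(gnm_matrix(coords, cutoff=7.0))
print("R_c = 4.0:", zeros_small, "zero eigenvalues; R_c = 7.0:", zeros_large)
```

With the smaller cutoff the network splits into two independent segments, each
contributing its own zero eigenvalue; the larger one bridges the gap. For an ANM the
same test applies, with six zero eigenvalues as the expected count.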

At first sight, it may seem that problems with small cutoff values could be solved
with a distance-dependent spring force constant, as proposed early on by Konrad
Hinsen [6]. However, it is clear that an exponential term, for instance, introduces a
typical length which, when too small, yields similar artifacts. Indeed, in such a case,
the additional free rigid-body motions obtained with too small a value for $R_c$ are
expected to be replaced by low-frequency motions involving the same poorly
connected groups of atoms.

Note that with ENMs other kinds of spurious low-frequency motions can be
observed. For instance, in crystal structures, protein N- and C-terminal ends are often
found to extend away from the rest of the structure. As a consequence, large
amplitude, usually meaningless, motions of these (almost) free ends can be found among
the lowest-frequency modes. So, in order to obtain significant and clear-cut results,
it is highly recommended to begin an ENM study by "cleaning" the studied
structure, namely, by removing such free ends.

A similar kind of spurious low-frequency motion can be observed with all-atom
ANMs, in which groups of poorly connected atoms are involved, typically those at the
end of long sidechains [45]. Note that an elegant way to cure such artifacts is to use
the RTB approximation [46, 47], which makes it possible to remove from the Hessian
matrix all contributions associated with motions occurring inside each "block" the
system is split into (RTB stands for Rotation-Translation of Blocks). In most cases, a
block corresponds to a given amino-acid residue but, while atom-atom interactions are
taken into account when the atoms belong to different blocks, each block can also
correspond to a whole protein subunit, allowing for the study of systems as large as
entire virus capsids [15].

4 Empirical foundations 

As illustrated above, ENMs and NMA are closely related. As a consequence, the
theoretical foundations of ENMs are for the most part those of NMA. However,
when applied to complex molecular systems, NMA is known to have obvious
drawbacks (see section 3.1). So, if NMA is still widely performed, it is because of its
empirical, sometimes unexpected, successes. As recalled below, most of these
successes can also be achieved by studying ENMs.

4.1 B-factors

From eq. 3 and eq. 4 it is straightforward to show that $\langle \Delta r_i^2 \rangle$,
the fluctuation of coordinate i with respect to its equilibrium value, is such that:

\[
\langle \Delta r_i^2 \rangle = \frac{k_B T}{m_i}
  \sum_{k=1}^{n_{nz}} \frac{a_{ik}^2}{4\pi^2\nu_k^2} \tag{11}
\]

$n_{nz}$ being the number of non-zero frequency normal modes of the system, namely,
$n_{nz} = N - 1$ when a GNM is considered and $n_{nz} = 3N - 6$ when it is an ANM.
However, in practice, since such fluctuations scale as the inverse of the square of
$\nu_k$, the mode frequency, a sum over the lowest-frequency normal modes of the
system is usually enough for obtaining a fair approximation [34].

On the other hand, $B_i$, the crystallographic Debye-Waller factor (the so-called
isotropic B-factor) of atom i, is expected to be related to the fluctuations of its
atomic coordinates through:

\[
B_i = \frac{8\pi^2}{3}
      \langle \Delta x_i^2 + \Delta y_i^2 + \Delta z_i^2 \rangle \tag{12}
\]

Although other physical factors are involved, like crystal disorder or lattice phonons,
as well as non-physical ones, like the number of water molecules included in the
structure refinement process by crystallographers, significant correlations between
B-factor values predicted using eqs. 11 and 12 and experimentally obtained ones have
been reported in numerous cases.
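
A minimal sketch of such a prediction, assuming NumPy, unit masses (so that the
eigenvalues of the Hessian are directly the $4\pi^2\nu_k^2$ of eq. 11) and an
arbitrary energy unit for $k_B T$. The function name `predicted_bfactors` and the
two-atom toy Hessian are hypothetical.

```python
import numpy as np

def predicted_bfactors(hessian, kT=1.0, tol=1e-8):
    """Isotropic B-factors from eqs. 11 and 12, for unit masses.

    hessian : (3N, 3N) ANM Hessian; with unit masses its eigenvalues
              are the (2*pi*nu_k)^2 of eq. 11.
    kT      : k_B T expressed in the energy unit implied by the Hessian.
    """
    eigenvalues, eigenvectors = np.linalg.eigh(np.asarray(hessian, dtype=float))
    keep = eigenvalues > tol                       # skip zero-frequency modes
    # <dr_i^2> = kT * sum_k a_ik^2 / lambda_k, one value per coordinate.
    fluct = kT * (eigenvectors[:, keep] ** 2 / eigenvalues[keep]).sum(axis=1)
    per_atom = fluct.reshape(-1, 3).sum(axis=1)    # <dx^2 + dy^2 + dz^2>
    return 8.0 * np.pi ** 2 / 3.0 * per_atom

# Toy Hessian (hypothetical): two atoms joined by one unit spring along x.
toy = np.zeros((6, 6))
toy[0, 0] = toy[3, 3] = 1.0
toy[0, 3] = toy[3, 0] = -1.0
b_pred = predicted_bfactors(toy)
print("predicted B-factors:", b_pred)

# Comparison with experiment would typically use the Pearson correlation,
# e.g. np.corrcoef(b_pred, b_exp)[0, 1], with b_exp read from a PDB file.
```

Since a correlation coefficient is insensitive to an overall scale factor, the values
of `kT` and of the spring constant do not affect such a comparison.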

For instance, in a study of 30 protein GNMs ($R_c$ = 7.5 Å), a mean value of
0.62 ± 0.13 for this correlation coefficient was found [7]. Interestingly, in the same
study, 26 other proteins were considered, for which accurate relaxation
measurements had been performed by NMR, and the mean correlation between the
corresponding fluctuations and those obtained using eq. 11 was found to be
significantly higher, namely, 0.76 ± 0.04, a remarkable agreement with the
experimental data being achieved in several cases, with a correlation coefficient over
0.9 for four of them [7]. Amazingly, ANMs do not perform significantly better. For
instance, in a study of 83 proteins ($R_c$ = 16 Å), a mean value for the correlation
coefficient of 0.68 ± 0.11 between predicted and experimental isotropic B-factors was
obtained [8] while, using the all-atom ANM ($R_c$ = 5 Å) implemented in the Elnemo
webserver [14], which makes use of the RTB approximation [46, 47], a very similar
value of 0.68 ± 0.13 was found [8].

Note that in both studies mentioned above, when eq. 11 was used, overall
translations or rotations of the entire protein within the crystal cell were excluded
from the calculation, while it is well known that such motions are able to provide by
themselves good correlations with experimental values [48]. In other words, much
better correlations with experimental B-factors can be obtained by mixing NMA
predictions with protein rigid-body motions, the latter accounting partly for crystal
disorder, but mostly for the phonon modes of the whole crystal. Interestingly, these
latter modes can be taken into account within the frame of ENM studies, simply by
including all crystal cell symmetries in the model [49, 50, 51].

Of course, such significant correlations with experimental data can only be
obtained because the amplitude of atomic thermal fluctuations scales as the inverse
of mode frequencies (see eq. 11). Indeed, with crude models like ENMs, the
actual high-frequency modes of a protein cannot be predicted, because such modes
strongly depend upon the chemical details of the structure, only a few neighboring
atoms (e.g., covalently bonded ones) being involved in the highest-frequency
modes. This does not mean, though, that the high-frequency modes of an ENM cannot
bring any useful information. Indeed, they correspond to local motions occurring
within the parts of the structure whose density is the highest [38]. Moreover, it
has been shown that such regions often lie near enzyme active sites [52, 53].

On the other hand, the B-factor values themselves cannot directly be obtained by
studying ENMs, since their average is inversely proportional to $k_{enm}$. Indeed, it
is customary to choose $k_{enm}$ so as to match average experimental B-factor
values [9]. Another common way is to choose $k_{enm}$ so as to reproduce the lowest
frequency of the system, as obtained using all-atom force fields [52].

4.2 The relationship with protein functional motions 

The seminal paper of M. Tirion ends with the statement that [1]:

Tests performed on a periplasmic maltodextrin binding protein (MBP) indicate that the
slowest modes do indeed closely map the open form into the closed form (Tirion, in
preparation).



Fig. 1 Left: the open (ligand-free) form of maltodextrin binding protein (PDB identifier 1OMP).
Right: the corresponding Elastic Network Model. Pairs of Cα atoms are linked by springs (plain
lines) when they are less than 8 Å from each other. Drawn with Molscript [54].

The next paper of M. Tirion never came out but her result was confirmed a few
years later, as part of a study of 20 protein ENMs ($R_c$ = 8 Å) in both their
ligand-free (open) and ligand-bound (closed) forms [10]. Indeed, for MBP, it was found
that the overlap between its second lowest-frequency mode and its functional
conformational change is close to 0.9. This means that about 80% of the functional
motion of MBP ($0.9^2 \approx 0.8$) can be described by varying the normal coordinate
associated with a single one of its modes. This follows from the definition of $O_k$,
the overlap with mode k, which is given by:

\[
O_k = \frac{\sum_{i=1}^{3N} a_{ik} \Delta r_i}
           {\sqrt{\sum_{i=1}^{3N} a_{ik}^2 \, \sum_{i=1}^{3N} \Delta r_i^2}} \tag{13}
\]

where $\Delta r_i$ is the variation of coordinate i between the open and the closed
form after both structures have been superimposed [55]. On the other hand, since the
modes of MBP form an orthogonal basis set, the following property holds:

\[
\sum_{k=1}^{3N} O_k^2 = 1 \tag{14}
\]
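
The overlap calculation can be sketched as follows, assuming NumPy and taking the
overlap to be the normalized projection of the conformational change onto each mode
(the precise normalization is an assumption of this example). The function name
`mode_overlaps` and the random toy data are hypothetical; any complete orthonormal
basis is enough to check that the squared overlaps sum to one.

```python
import numpy as np

def mode_overlaps(modes, displacement):
    """Overlap of each mode with a conformational change.

    modes        : (3N, n_modes) array, one eigenvector per column
    displacement : (3N,) array, coordinate differences between the two
                   superimposed structures
    """
    modes = np.asarray(modes, dtype=float)
    displacement = np.asarray(displacement, dtype=float)
    norms = np.linalg.norm(modes, axis=0) * np.linalg.norm(displacement)
    return (modes.T @ displacement) / norms

# Toy check (hypothetical data): eigenvectors of a random symmetric matrix.
rng = np.random.default_rng(0)
symmetric = rng.normal(size=(9, 9))
symmetric = symmetric + symmetric.T
_, modes = np.linalg.eigh(symmetric)
displacement = rng.normal(size=9)

overlaps = mode_overlaps(modes, displacement)
cumulative = np.cumsum(overlaps ** 2)   # reaches 1 once all modes are included
print("sum of squared overlaps:", cumulative[-1])
```

In practice the cumulative sum, restricted to the few lowest-frequency modes, tells
which fraction of an observed conformational change these modes can account for.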



More generally, it was found that when the conformational change of a protein
upon ligand binding happens to be highly collective, one of its low-frequency
normal modes often compares well with the experimental motion (overlap over
0.5 [10]). Since then, a study of nearly 4,000 cases has confirmed this result [11],
while another study of a set of proteins with similar functions and shapes, but
various folds, namely DNA-dependent polymerases [12], has shown that the
low-frequency modes of a protein, and hence the nature of its large amplitude motions,
are likely to be determined by its shape [10, 56, 57].

Indeed, this latter point has recently been confirmed in a rather direct way, by
considering ENMs built in such a way that each amino acid interacts with a given
number of neighbors (the closest ones). Then, at variance with cutoff-based ENMs,
the rigidity of the system is fairly constant from one site to another. However, the
relationship between the lowest-frequency modes of a protein and its functional
motion is preserved. Specifically, it was found that the subspace defined by up to the
10-12 lowest-frequency modes of a protein is conserved, whatever model is used.
Moreover, when no such, so-called robust, subspace exists, the functional motion of
the protein is found to be either localized and/or of small amplitude (typically: less
than 2-3 Å of Cα r.m.s.d.) [42].

In retrospect, these results make sense. To begin with, a strong relationship between
low-frequency modes and protein functional motions was first observed within the frame
of NMA studies performed at a highly detailed, atomic level of description,
notably in the cases of lysozyme [58], hexokinase [59], citrate synthase [55] and
hemoglobin [60]. Since, as recalled above, it was later found that such a
relationship also holds when most chemical details are removed, it is clear that the
property captured by NMA has to be a very general one. On the other hand, K. Hinsen has
convincingly shown that the low-frequency modes of a protein can be used to split its
structure into well-defined domains [6], with the additional advantage of a smooth,
almost continuous, description of their boundaries. So, since it is well known that
most large amplitude protein functional motions can be well described as
combinations of almost rigid-body motions of entire structural domains [61, 62], the
relationship found between these motions and the low-frequency modes of ENMs is just
another demonstration that whole quasi-rigid domain motions are involved in such
modes. On the other hand, it is not that difficult to admit that the spatial clustering
of amino acids into domains can be revealed by studying protein dynamical
properties, even at a crude level of description. A corollary of this line of thought
is that ENMs should perform better, as far as low-frequency and large amplitude motions
are concerned, in the case of large, multi-domain systems.



4.3 Applications 

As illustrated above, NMA of ENMs seems to have a clear predictive power. So,
given both the simplicity of these models and their coarse-grained nature, many
applications have been proposed. For instance, as suggested early on, being able to
guess the pattern of atomic fluctuations through eq. 11 may prove useful for refining
crystal structures [63, 64].

However, most applications take advantage of the possibility to predict atomic
displacements through the inverse of eq. 6, namely:

\[
r_i = r_i^0 + \frac{1}{\sqrt{m_i}} \sum_{k=1}^{n_{sub}} a_{ik} q_k \tag{15}
\]

where $n_{sub}$ is the number of low-frequency modes considered to be enough for
performing an accurate prediction. In the simplest case, mode amplitudes can be varied
arbitrarily, one mode after the other. Indeed, in the light of enough experimental
data, the analysis of such trajectories can be enough for getting insights into
the nature of the functional motion of a protein [13, 65]. Some of the conformations
thus obtained can also allow for solving difficult molecular-replacement problems,
although it is often necessary to explore at least a couple of modes in order to reach
a useful conformation [18]. More generally, eq. 15 can be used so as to reduce the
dimensionality of the system and, thus, to find more easily protein conformations
fulfilling a given set of constraints. For instance, it has been used for fitting known
structures into low-resolution electron density maps [19, 20, 21, 23] providing, for
instance, more detailed structural data for systems of major interest, like the
ribosome [22].

Note that eq. 15 is linear. As a consequence, atom motions follow straight lines
and local distortions (of most chemical bonds, valence angles, etc.) cannot
be avoided. So, for many applications, as well as for obtaining well-behaved
normal mode trajectories, the conformations thus generated need to be
"regularized" [18], using for instance a detailed all-atom force field and standard
energy-minimization techniques.
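
The following sketch, assuming NumPy, applies eq. 15 and makes the distortion issue
visible on a toy case. The function name `displace_along_modes` and the two-atom
example are hypothetical: the single mode used moves the two atoms in opposite
directions, perpendicular to the line joining them, as a rotation would do at first
order.

```python
import numpy as np

def displace_along_modes(coords0, modes, amplitudes, masses=None):
    """Coordinates generated with eq. 15.

    coords0    : (N, 3) equilibrium coordinates
    modes      : (3N, n_sub) array, eigenvectors a_ik as columns
    amplitudes : (n_sub,) values given to the normal coordinates q_k
    masses     : (N,) atomic masses (unit masses if omitted)
    """
    coords0 = np.asarray(coords0, dtype=float)
    n_atoms = coords0.shape[0]
    masses = np.ones(n_atoms) if masses is None else np.asarray(masses, float)
    shift = np.asarray(modes, float) @ np.asarray(amplitudes, float)
    shift = shift.reshape(n_atoms, 3) / np.sqrt(masses)[:, None]
    return coords0 + shift

# Toy case (hypothetical): two atoms 2 angstroms apart along x, one mode
# moving them in opposite directions along y.
coords0 = np.array([[-1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
mode = np.array([[0.0, 1.0, 0.0, 0.0, -1.0, 0.0]]).T / np.sqrt(2.0)

for q in (0.0, 1.0, 2.0):
    coords = displace_along_modes(coords0, mode, [q])
    bond = np.linalg.norm(coords[0] - coords[1])
    print(f"q = {q:.1f}  interatomic distance = {bond:.3f}")
```

Each atom moves along a straight line, so the interatomic distance grows with the
amplitude instead of staying constant, which is the kind of local distortion that a
subsequent regularization step is meant to remove.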



5 Conclusion 

Fifteen years after their introduction in the realm of molecular biophysics [1], thanks
to their simplicity as well as to their coarse-grained nature, Elastic Network Models
are becoming more and more popular. Indeed, many applications have been
proposed, notably within the frame of various structural biology techniques.

From a theoretical point of view, their relationship with Normal Mode Analysis is
obvious, since both approaches rely on a quadratic form for the energy function, the
former by definition, the latter as a consequence of a small displacement, so-called
harmonic (or linear), approximation.

From an empirical point of view, it has been extensively shown that normal mode
studies of Elastic Network Models yield low-frequency, large amplitude and
collective motions which often prove similar to those obtained with an all-atom model
and a standard empirical force field.

This is likely to be a consequence of the robustness of these motions [42].
Moreover, such motions often provide fair predictions for the pattern of thermal atomic
fluctuations (e.g. the crystallographic B-factors) or for the kind of functional motion
a given protein can perform (e.g. its conformational change upon ligand binding).